Fix float 16 for cuda collectives #414
Conversation
case GA_DOUBLE: return ncclDouble;
case GA_LONG: return ncclInt64;
case GA_ULONG: return ncclUint64;
#ifdef CUDA_HAS_HALF
I'm not sure what to do if CUDA_HAS_HALF isn't defined. I think it should always be defined, as we only support recent enough CUDA versions, so I would just remove the ifdef. Do you agree with that?
Otherwise, it looks good. Thanks.
This is defined in nccl.h and is enabled if there is compatibility. Should I remove it?
I remember that, but I think GA_HALF is always defined. So this causes a problem if libgpuarray supports it while NCCL doesn't, and we don't give a good error message. We require CUDA 7 or more recent:
http://deeplearning.net/software/libgpuarray/installation.html#run-requirements
And CUDA_HAS_HALF is always true in that case. So I would remove the ifdef.
Yes, don't use an ifdef switch, since we load the libraries dynamically and we don't want to disable it for future loads.
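For illustration, here is a minimal sketch of what the mapping could look like with the ifdef removed. This is not the actual libgpuarray code: the function name convert_data_type is hypothetical, and the sketch assumes the GA_* typecodes from gpuarray/types.h, NCCL 1.x's ncclDataType_t from nccl.h (including its nccl_NUM_TYPES sentinel), and a CUDA version recent enough that nccl.h defines ncclHalf.

```c
/* A minimal sketch, not the actual libgpuarray code: map a gpuarray
 * typecode to the NCCL dtype used by the collectives.  GA_HALF is
 * handled unconditionally, assuming a CUDA version recent enough that
 * nccl.h defines ncclHalf.  Unsupported typecodes fall through to a
 * sentinel so the caller can raise a clear runtime error instead of
 * the case being silently compiled out. */
#include <gpuarray/types.h> /* GA_* typecodes */
#include <nccl.h>           /* ncclDataType_t */

static ncclDataType_t convert_data_type(int typecode) {
  switch (typecode) {
  case GA_BYTE:   return ncclChar;
  case GA_INT:    return ncclInt;
  case GA_HALF:   return ncclHalf;  /* half-precision scalar type */
  case GA_FLOAT:  return ncclFloat;
  case GA_DOUBLE: return ncclDouble;
  case GA_LONG:   return ncclInt64;
  case GA_ULONG:  return ncclUint64;
  default:        return nccl_NUM_TYPES; /* unsupported: report at runtime */
  }
}
```

Returning a sentinel for unsupported typecodes keeps half support a runtime property of the loaded libraries rather than a compile-time one, in line with the point above about dynamic loading.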
I pushed the small change to this PR. If Travis passes, I'll merge. Thanks.
case GA_LONG: return ncclInt64;
case GA_ULONG: return ncclUint64;
case GA_HALF: return ncclHalf;
case GA_FLOAT16: return ncclHalf;
GA_FLOAT16 means a vector of 16 floats, not the half-precision scalar type (that is GA_HALF). This is horribly inaccurate.
I had asked on #411 about the types. Sorry for the inconvenience.